Unicode character - definition. What is Unicode character. Meaning, concept

What is Unicode character - definition

WIKIMEDIA LIST ARTICLE
Mapping of Unicode characters; List of Unicode ranges; Surrogate pair; Mapping of Unicode blocks; Mapping of Unicode graphic characters; Unicode character; Unicode range; Unicode codepage; Universal Character Set Characters; Unicode characters; High Surrogates; High Private Use Surrogates; Low Surrogates; Universal Character Set character; High PU Surrogates; High Surrogates (Unicode block); High Private Use Surrogates (Unicode block); Low Surrogates (Unicode block); Surrogate mechanism; Surrogate code point; High surrogate; Low surrogate; Surrogate code points; Noncharacter; Surrogates in Unicode; Non-character; ﷐; ﷑; ﷒; ﷓; ﷔; ﷕; ﷖; ﷗; ﷘; ﷙; ﷚; ﷛; ﷜; ﷝; ﷞; ﷟; ﷠; ﷡; ﷢; ﷣; ﷤; ﷥; ﷦; ﷧; ﷨; ﷩; ﷪; ﷫; ﷬; ﷭; ﷮; ﷯; 􏿾; 􏿿; 󿿿; 󿿾; 󯿿; 󯿾; 󟿿; 󟿾; 󏿿; 󏿾; 򿿿; 򿿾; 򯿿; 򯿾; 򟿿; 򟿾; 򏿿; 򏿾; 񿿿; 񿿾; 񯿿; 񯿾; 񟿿; 񟿾; 񏿿; 񏿾; 𿿿; 𿿾; 𯿿; 𯿾; 🿿; 🿾
  • Apple Chancery) shows the synthesized common fraction on the left and the precomposed fraction glyph on the right, as a rendering of the plain text string "1 1⁄4 1¼". Depending on the text environment, the single string "1 1⁄4" might yield either result; the one on the right arises through substitution of the fraction sequence with the single precomposed fraction glyph.
  • Apple Chancery. This font supplies the text layout software with instructions to synthesize the fraction according to the Unicode rule described in this section.

Unicode control characters         
NON-PRINTING FORMAT EFFECTORS AND CONTROL CODES INCLUDED IN UNICODE
␁; ␂; ␐; ␙; ␜; ␝; ␞; ␟; ; ; ; Unicode control character; Bidirectional control characters
Many Unicode characters are used to control the interpretation or display of text, but these characters themselves have no visual or spatial representation. For example, the null character (U+0000 NULL) is used in C programming environments to indicate the end of a string of characters.
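
As a rough illustration of the paragraph above, the following Python sketch uses the standard `unicodedata` module to test whether a character belongs to the control (Cc) or format (Cf) general categories, and mimics C-style null termination on a Python string. The helper name `is_control_or_format` and the sample buffer are made up for this example.

```python
import unicodedata


def is_control_or_format(ch: str) -> bool:
    """Return True if ch has general category Cc (control) or Cf (format)."""
    return unicodedata.category(ch) in {"Cc", "Cf"}


NUL = "\u0000"  # the null character, U+0000

# Emulate how a C routine would read a buffer: stop at the first NUL.
c_style_buffer = "abc" + NUL + "ignored"
visible_part = c_style_buffer.split(NUL, 1)[0]
```

Here `is_control_or_format("\u200e")` is true because U+200E LEFT-TO-RIGHT MARK is a format character, one of the bidirectional controls.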
Unicode         
COMPUTING INDUSTRY STANDARD FOR THE CONSISTENT ENCODING, REPRESENTATION AND HANDLING OF TEXT EXPRESSED IN MOST OF THE WORLD'S WRITING SYSTEMS
UniCode; Unicode Transformation Format; Unicode.org; Unicode Standard; Unicode pipeline; Unicode Pipeline; The Unicode Standard; Unicode roadmap; MES-1; MES-2; Unicode transformation format; Unicode 5.0; U+; Yunicode; Unicode 5.1; Unicode 5.2; Brakcet; Unicode 8; UNICODE; Uni-code; Unicode anomaly; Unicode 6.0; Unicode codepoint; Unicode 6.1; Unicode 6.2; Unicode versions; Unicode code points; Unicode alias; Unicode 6.3; Unicode 7.0; Unicode Transformation Formats; Unicode 6; Unicode 1.0.0; Unicode 1.0.1; Unicode 1.1; Unicode 2.0; Unicode 2.1; Unicode 3.0; Unicode 3.1; Unicode 3.2; Unicode 4.0; Unicode 4.1; Unicode 8.0; Unicode 1.0; Unicode 1; Unicode 2; Unicode 3; Unicode 4; Unicode 5; Unicode 7; Unicode 9.0; Unicode 9; Unicode 88; Unicode 10; Unicode 10.0; Unicode 9.0.0; Unicode 10.0.0; Unicode 8.0.0; Unicode 7.0.0; Unicode 6.0.0; Unicode 5.0.0; Unicode 4.0.0; Unicode 3.0.0; Unicode 2.0.0; Unicode code point; Unicode 11; Unicode 11.0; Unicode 12; Unicode 12.0; Multilingual European subsets; Multilingual European subset; Bulldog Award; Unicode 12.1; Script Encoding Initiative; Unicode 13; Unicode 13.0; Unicode notation; Unicode 12.0.0; Unicode 13.0.0; Unicode 11.0.0; Unicode 3.1.1; Unicode 3.0.1; Unicode 14.0.0; Unicode 14.0; Unicode 14; Unicode 12.1.0; Unicode 1.1.0; Unicode 1.1.5; Unicode 2.1.0; Unicode 3.1.0; Unicode 4.0.1; Unicode 4.1.0; Unicode 5.1.0; Unicode 5.2.0; Unicode 6.1.0; Unicode 6.2.0; Unicode 6.3.0; Unicode 3.2.0; Unicode 2.1.5; Unicode 2.1.8; Unicode 2.1.9; Unicode 2.1.2; The Unicode Bulldog Award; Unicode Bulldog Award; Unicode 15; Unicode 15.0; Unicode 15.0.0; Unicode Character Set; Unicode Version History
1. <character> A 16-bit character set standard, designed and maintained by the non-profit consortium Unicode Inc. Originally Unicode was designed to be universal, unique, and uniform: the code was to cover all major modern written languages (universal), each character was to have exactly one encoding (unique), and each character was to be represented by a fixed width in bits (uniform).
In parallel with the development of Unicode, an ISO/IEC standard was being worked on that put a large emphasis on compatibility with existing character codes such as ASCII or ISO Latin 1. To avoid having two competing 16-bit standards, in 1992 the two teams compromised to define a common character code standard, known both as Unicode and BMP. Since the merger the character codes are the same, but the two standards are not identical: the ISO/IEC standard covers only coding, while Unicode includes additional specifications that help implementation.
Unicode is not a glyph encoding. The same character can be displayed as a variety of glyphs, depending not only on the font and style, but also on the adjacent characters. A sequence of characters can be displayed as a single glyph, or a single character can be displayed as a sequence of glyphs. Which is the case is often font dependent.
See also Jürgen Bettels and F. Avery Bishop's paper {Unicode: A universal character code (http://research.compaq.com/wrl/DECarchives/DTJ/DTJB02/DTJB02SC.TXT)}. (2002-08-06)
2. <language> A pre-Fortran on the IBM 1130, similar to MATH-MATIC. [Sammet 1969, p.137]. (2004-09-14)

Wikipedia

Universal Character Set characters

The Unicode Consortium and the ISO/IEC JTC 1/SC 2/WG 2 jointly collaborate on the list of the characters in the Universal Coded Character Set. The Universal Coded Character Set, most commonly called the Universal Character Set (abbr. UCS, official designation: ISO/IEC 10646), is an international standard to map characters, discrete symbols used in natural language, mathematics, music, and other domains, to unique machine-readable data values. By creating this mapping, the UCS enables computer software vendors to interoperate and to interchange UCS-encoded text strings with one another. Because it is a universal map, it can be used to represent multiple languages at the same time. This avoids the confusion of using multiple legacy character encodings, where the same sequence of codes can have several interpretations depending on the character encoding in use, and where choosing the wrong one produces mojibake.
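
The mojibake problem described above can be reproduced in a few lines. This is a minimal Python sketch, assuming a sender that encodes text as UTF-8 and a receiver that wrongly interprets the same bytes as Latin-1 (ISO 8859-1); the variable names are illustrative only.

```python
text = "é"  # U+00E9 LATIN SMALL LETTER E WITH ACUTE

# The sender serializes the character as UTF-8: two bytes, 0xC3 0xA9.
utf8_bytes = text.encode("utf-8")

# A receiver assuming Latin-1 reads each byte as a separate character.
wrong = utf8_bytes.decode("latin-1")

# A receiver using the correct encoding recovers the original text.
right = utf8_bytes.decode("utf-8")
```

The same two bytes yield `"Ã©"` under Latin-1 and `"é"` under UTF-8, which is exactly the ambiguity that a single universal mapping is meant to remove.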

UCS has a potential capacity of over 1 million characters. Each UCS character is abstractly represented by a code point, an integer between 0 and 1,114,111 (1,114,112 = 2²⁰ + 2¹⁶ or 17 × 2¹⁶ = 0x110000 code points), used to represent each character within the internal logic of text processing software. As of Unicode 15.0, released in September 2022, 293,168 (26%) of these code points are allocated, 149,251 (13%) have been assigned characters, 137,468 (12.3%) are reserved for private use, 2,048 are used to enable the mechanism of surrogates, and 66 are designated as noncharacters, leaving the remaining 820,944 (74%) unallocated. The number of encoded characters is made up as follows:

  • 149,014 graphical characters (some of which do not have a visible glyph, but are still counted as graphical)
  • 237 special purpose characters for control and formatting.
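
Several of the code-space figures quoted above can be checked with simple arithmetic. The Python sketch below is one way to do so; the specific ranges used for surrogates, private use areas, and noncharacters are taken from the Unicode Standard rather than from the text above, and the names are chosen for this example.

```python
CODE_SPACE = 0x110000  # number of code points, U+0000..U+10FFFF
PLANE_SIZE = 2**16     # code points per plane

# Surrogates occupy U+D800..U+DFFF.
surrogate_count = len(range(0xD800, 0xE000))

# Private use: U+E000..U+F8FF plus planes 15 and 16 minus their
# last two code points, which are noncharacters.
private_use_count = (
    len(range(0xE000, 0xF900))
    + len(range(0xF0000, 0xFFFFE))
    + len(range(0x100000, 0x10FFFE))
)


def is_noncharacter(cp: int) -> bool:
    """U+FDD0..U+FDEF, plus the last two code points of every plane."""
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


noncharacter_count = sum(is_noncharacter(cp) for cp in range(CODE_SPACE))
```

With these definitions, `CODE_SPACE` equals both `2**20 + 2**16` and `17 * PLANE_SIZE`, and the three counts come out to 2,048, 137,468, and 66 respectively.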

ISO maintains the basic mapping of characters from character name to code point. Often, the terms character and code point will be used interchangeably. However, when a distinction is made, a code point refers to the integer of the character: what one might think of as its address. Meanwhile, a character in ISO/IEC 10646 is the combination of the code point and its name. Unicode adds many other useful properties to the character set, such as block, category, script, and directionality.
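
The distinction between a code point, a character name, and the additional Unicode properties can be seen with Python's standard `unicodedata` module. This is only a sketch: the module exposes name, general category, and bidirectional class, but not block or script, so those two properties are left out here.

```python
import unicodedata

ch = "\u05d0"  # sample character chosen for this example

code_point = ord(ch)                  # the integer "address"
name = unicodedata.name(ch)           # the character name
category = unicodedata.category(ch)   # general category
bidi = unicodedata.bidirectional(ch)  # bidirectional class

# Conventional U+XXXX notation for the code point.
notation = f"U+{code_point:04X}"
```

For this character the name is `HEBREW LETTER ALEF`, the category is `Lo` (letter, other), and the bidirectional class is `R` (right-to-left).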

In addition to the UCS, the supplementary Unicode Standard (not a joint project with ISO, but rather a publication of the Unicode Consortium) provides other implementation details such as:

  1. mappings between UCS and other character sets
  2. different collations of characters and character strings for different languages
  3. an algorithm for laying out bidirectional text ("the BiDi algorithm"), where text on the same line may shift between left-to-right ("LTR") and right-to-left ("RTL")
  4. a case-folding algorithm
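
As an illustration of the last item, Python's built-in `str.casefold()` applies Unicode case folding. The sketch below uses it for caseless comparison; the helper name `caseless_equal` is made up for this example, and the comparison is deliberately simplified in that it skips the normalization step a stricter caseless match would add.

```python
def caseless_equal(a: str, b: str) -> bool:
    """Compare two strings after Unicode case folding (no normalization)."""
    return a.casefold() == b.casefold()


word = "Straße"
lowered = word.lower()    # simple lowercasing keeps the sharp s
folded = word.casefold()  # case folding expands it to "ss"
```

Case folding is more aggressive than lowercasing: `caseless_equal("Straße", "STRASSE")` is true, whereas comparing the `lower()` forms of the same strings would report a mismatch.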

Computer software end users enter these characters into programs through various input methods, for example, physical keyboards or virtual character palettes.

The UCS can be divided in various ways, such as by plane, block, character category, or character property.
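
Division by plane is the simplest of these to compute. Assuming, as in the Unicode code space, 17 planes of 2¹⁶ code points each, the plane number is just the code point shifted right by 16 bits. The function name `plane_of` in this Python sketch is hypothetical.

```python
MAX_CODE_POINT = 0x10FFFF


def plane_of(code_point: int) -> int:
    """Return the plane number (0..16) that contains code_point."""
    if not 0 <= code_point <= MAX_CODE_POINT:
        raise ValueError(f"not a valid code point: {code_point!r}")
    return code_point >> 16


bmp_example = plane_of(ord("A"))     # Basic Multilingual Plane
supplementary = plane_of(0x1F600)    # a plane 1 code point
last_plane = plane_of(MAX_CODE_POINT)
```

Division by block, category, or property would require lookup data from the Unicode Character Database rather than arithmetic alone.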